Wednesday, December 15, 2004
Get your units correct
Sunday, November 21, 2004
Hierarchical VMC (was Probabilistic Functions III)
I had wondered whether it was possible for VMC to get results as good as DMC. In theory, of course, the answer is "yes" - provided there is enough variational freedom. That is the hard part.
On to the second point.
I had proposed a variational wavefunction consisting of the sum of gaussians. The centers and widths were then the parameters to be optimized. The starting center points were to be generated by sampling from a more traditional wavefunction. Why bother? Why not start from a uniform distribution?
My speculative answer is that it would be horribly inefficient, and very likely to follow wrong paths (since the wavefunction is so "floppy"). So starting from a traditional wavefunction is a form of accelerating the convergence of the optimization procedure.
The idea seems to be one of hierarchical VMC, or of a hierarchical wavefunction that can be optimized in steps. The coarse parts are optimized first, and then the finer details are added later - vaguely like the way a wavelet decomposition or a multigrid method deals with spatial resolution.
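To make the coarse-then-fine idea concrete, here is a minimal sketch on a toy problem - the 1D harmonic oscillator, with the energy evaluated by quadrature on a grid instead of by VMC sampling, purely to keep it short. The centers are seeded by sampling from a traditional trial function (here just a gaussian roughly matching the true ground state), a single shared width is optimized first, and only then are all the parameters released. None of this is the real scheme, just the staging.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = np.linspace(-6.0, 6.0, 1201)

def energy(params):
    # Variational energy of the 1D harmonic oscillator (V = x^2/2, exact E0 = 0.5)
    # for a sum-of-gaussians trial function; params = [width, c1, ..., cn].
    # Quadrature on the grid; the grid spacing cancels in the ratio.
    w, centers = params[0], params[1:]
    p = sum(np.exp(-(x - c)**2 / (2.0*w*w)) for c in centers)
    dp = np.gradient(p, x)
    return np.sum(0.5*dp*dp + 0.5*x*x*p*p) / np.sum(p*p)

# Coarse stage: freeze the centers at points sampled from a traditional trial
# function and optimize only the single shared width.
centers0 = rng.normal(0.0, 1.0, 10)
coarse = minimize(lambda w: energy(np.concatenate((w, centers0))),
                  x0=[0.5], method="Nelder-Mead")

# Fine stage: release the centers as well, starting from the coarse solution.
fine = minimize(energy, x0=np.concatenate((coarse.x, centers0)), method="Nelder-Mead")
print("coarse E =", round(coarse.fun, 4), "  fine E =", round(fine.fun, 4), "  (exact 0.5)")
```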
It still remains to be seen if the sum-of-gaussians wavefunction survives contact with reality.
Thursday, November 18, 2004
Probabilistic Functions II
I was confused because I had a set of sample points, but also a function using those points that could be considered a variational trial function in its own right.
The width of gaussians can be taken as a variational parameter, and optimized.
The idea for an MC scheme would be to generate a set of sample points from a trial function. Then the diffusion algorithm would be stepped forward in time to get a better approximation to the wave function.
But we can go further and consider all the center points as variational parameters as well. And we don't actually care about the diffusion part - it is only a device to relax the wavefunction to the ground state. We could use any optimization method instead. Now, this form for the wavefunction is likely to have lots of linear dependencies, and some optimization methods may have problems with that.
One could simply do a random walk - try moving a point, evaluate the energy (with VMC), and accept or reject the move based on the energy difference. Could one get DMC-like quality from VMC?
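A toy version of that random walk in parameter space - 1D harmonic oscillator, a fixed width, and the energy evaluated by quadrature rather than by VMC, just to keep the sketch self-contained. The acceptance "temperature" below is a made-up optimization knob, not anything physical.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-6.0, 6.0, 1201)
w = 0.5                                 # fixed gaussian width (could be walked too)

def energy(centers):
    # Variational energy of the 1D harmonic oscillator (V = x^2/2, exact E0 = 0.5),
    # evaluated by quadrature on the grid; the grid spacing cancels in the ratio.
    p = sum(np.exp(-(x - c)**2 / (2.0*w*w)) for c in centers)
    dp = np.gradient(p, x)
    return np.sum(0.5*dp*dp + 0.5*x*x*p*p) / np.sum(p*p)

centers = list(rng.normal(0.0, 2.0, 12))    # deliberately poor starting centers
E = energy(centers)
T_opt = 0.01        # fictitious "temperature" for occasionally accepting uphill moves
for step in range(3000):
    i = rng.integers(len(centers))
    trial = centers.copy()
    trial[i] += rng.normal(0.0, 0.3)        # try moving one center point
    E_trial = energy(trial)
    # accept or reject the parameter move based on the energy difference
    if E_trial < E or rng.random() < np.exp(-(E_trial - E) / T_opt):
        centers, E = trial, E_trial
print("energy after the random walk:", round(E, 4), "  (exact ground state: 0.5)")
```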
I can see a couple of possible problems so far. Getting the boundary conditions and cusp conditions correct may be difficult. (The "basis" functions don't need to be a set of gaussians - just positive functions with a center point and a width.) Also, there may be so much variational freedom that convergence takes a long time.
Wednesday, November 17, 2004
Probabilistic Functions
The output of a Monte Carlo simulation is a set of sample points distributed according to some weight function f(x). If we want to compute a normalized average over f(x), this output is exactly what we want. Now suppose we want to compute derivatives of f(x) (the energy in QMC calculations). In VMC, we can open the box and there is an analytic form for f(x) sitting there that we can take derivatives of. But in DMC, the situation may not be so simple - f(x) may be determined solely by the distribution of x's. (I was trying to figure out how to compute the energy when doing DMC w/o a trial function.)
One way to compute the derivative of f(x) is to keep a histogram of x's and take the derivative after the fact. However, this solution seems noise prone, and not likely to scale well in multiple dimensions.
Each sample is a delta function. Integrals over the function become integrals over a sum of delta functions centered around the sample points. What if we replaced the delta functions by gaussians (since a gaussian becomes a delta function in the limit of vanishing width)? Each point then represents a probability distribution with a finite width.
The first attempt at representing the ground state wavefunction of an infinite square well failed miserably - I could get any value of the energy I wanted by varying the width of the gaussians. The problem is the boundary conditions - the wavefunction must vanish at the edges. So then I tried using image gaussians (outside of the square well) to force the appropriate BC's. That worked much better. So well, in fact, that the value of the energy is quite good even when the center points are drawn from a uniform distribution (although I suspect this is largely a feature of the simplicity of the potential).
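For the record, roughly what that experiment looks like (not the original script): centers sampled from the exact ground-state density, one image gaussian reflected across each wall, and the energy computed by quadrature here rather than from the sample points, so the numbers won't match whatever I did before, but the construction is the same.

```python
import numpy as np

rng = np.random.default_rng(2)
L = 1.0
x = np.linspace(0.0, L, 2001)
E_exact = np.pi**2 / (2.0*L*L)              # ground state of the well, hbar = m = 1

def gauss(x, c, w):
    return np.exp(-(x - c)**2 / (2.0*w*w))

def psi(centers, w):
    # Sum of gaussians at the sampled centers, minus image gaussians reflected
    # across x = 0 and x = L so the wavefunction (nearly) vanishes at the walls.
    total = np.zeros_like(x)
    for c in centers:
        total += gauss(x, c, w) - gauss(x, -c, w) - gauss(x, 2.0*L - c, w)
    return total

def energy(centers, w):
    # <psi|H|psi>/<psi|psi> by quadrature; V = 0 inside the well, and the
    # integration-by-parts form used here relies on psi being ~0 at the walls.
    p = psi(centers, w)
    dp = np.gradient(p, x)
    return np.sum(0.5*dp*dp) / np.sum(p*p)

# Draw the center points from the exact ground-state density sin^2(pi x / L).
centers = []
while len(centers) < 50:
    c = L*rng.random()
    if rng.random() < np.sin(np.pi*c/L)**2:
        centers.append(c)

for w in (0.05, 0.1, 0.2):
    print(f"width {w:.2f}:  E = {energy(centers, w):.3f}   (exact {E_exact:.3f})")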
So what now?
With this sort of representation, I'm wondering if it's possible to use a forward Euler scheme to propagate the distribution forward in time (ie use a first order approximation to the time derivative in the diffusion eqn.). Either symbolically (ie, complicated formulas related to the original points) or by sampling (ie, generate a new set of points).
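One small observation in that direction: for the pure diffusion part (no potential term, no branching), a gaussian stays a gaussian and its variance just grows by 2*D*dt, so the "symbolic" route is trivial - only the widths change. The "sampling" route is the familiar DMC-style random displacement. A sketch comparing the two, with the rest of the propagator left out:

```python
import numpy as np

rng = np.random.default_rng(3)
D, dt = 0.5, 0.05
w0 = 0.1                                    # width of each gaussian in the sum

# "Symbolic" route: diffusing a gaussian for a time dt gives another gaussian
# whose variance has grown by 2*D*dt; the centers do not move.
w1 = np.sqrt(w0*w0 + 2.0*D*dt)

# "Sampling" route: draw one point from each gaussian, then displace it by a
# normal random step of standard deviation sqrt(2*D*dt).
centers = rng.normal(0.0, 1.0, 20000)       # a toy set of center points
points = centers + w0*rng.standard_normal(centers.size)
new_points = points + np.sqrt(2.0*D*dt)*rng.standard_normal(points.size)

# The spread of the new points about their centers should match the new width.
print("new width, symbolic route:", round(w1, 4))
print("new width, from samples  :", round(np.sqrt(new_points.var() - centers.var()), 4))
```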
Normally, one thinks of using orthogonal function expansions, since linearly dependent functions don't add anything new to the function. This sum of gaussians is definitely not orthogonal (well, they become more so at small widths), especially since the points are randomly distributed according to some distribution. Hmm. This contrast seems relevant somehow. The gaussians are all positive, and orthogonal functions usually have negative regions - I don't know if this is relevant or not.
Thursday, November 11, 2004
Comments for Everyone. A New Blog.
Also, feeling the need to speak of things other than QMC, I've started a different blog for such topics. (I will continue posting here, I just wanted a place for non-QMC or non-MC related items).
[Edit - removed link to personal blog. Email me if you really want to know]
Atomic and Molecular orbital applets
Thursday, November 04, 2004
DMC timestep and equilibration
In Chapter 12 (p. 227), the author points out that if we are only interested in the steady state, errors in the decay rate don't matter.
Chapter 13 talks about operator splitting, which looks very similar to how Diffusion Monte Carlo gets the short time approximation to the propagator. Section 13.4 talks about consistency - basically timestep error in the steady state.
I had the realization that one could use large timesteps in DMC for the equilibration part, and smaller timesteps once equilibrium was reached. More drastic population control measures are probably needed as well if large timesteps are used (at least I always had trouble maintaining a stable population of walkers when the timestep was too large).
This particular idea isn't terribly profound, but I'm finding it very enlightening to read about other methods for solving differential equations. I also like the presentation level of this book - lots of practical, concrete considerations, as well as descriptions of the consequences of various equations.
Wednesday, November 03, 2004
Qumax - QMC code
I downloaded the code, but haven't tried compiling or running it yet.
Wednesday, October 27, 2004
Fermion Sign Problem Officially Hard
Do I believe it?
Hmmm. The results presented deal with a discrete problem. I'm not sure if the results apply to continuous systems as well or not. The whole notion of NP-completeness and combinatorial complexity deals with discrete systems. For continuous systems, information-based complexity (IBC) might be a better way of analyzing the problem. Note that the "fruit-fly" of IBC research is multidimensional integration.
However, the distinction between continuous and discrete might not be important. For instance, with global optimization, finding the locations of the local minima is "easy", but then one is left picking between many different discrete local minima to find the global minimum (which makes it a potentially "hard" combinatorial problem - assuming the number of local minima grows exponentially with problem size).
I'm still perplexed as to the origins of the difficulty of the FSP (actually, I don't understand the origin of the difficulty for most NP problems). So a quick look into the form of the integrals is in order.
We start with wanting the integral I = int f(x) dx, where x is really a vector of high dimension. Monte Carlo fits the bill nicely, having error bounds independent of dimension. But f(x) is strongly peaked - only small volumes contribute most of the integral. To be efficient, we need importance sampling. The usual trick is to split f(x) into a product p(x)g(x), where p(x) >= 0. Since p is positive, we can interpret it as a probability and use various sampling methods to get int p(x) g(x) dx / int p(x) dx.
We can get away with this in VMC by taking p(x) = psi^2(x), which is positive. But in Diffusion Monte Carlo, p(x) = psi(x) phi(x), where psi(x) is a guess and phi(x) is the exact ground state wavefunction. Now psi and phi will not have zeros at exactly the same locations, so p(x) will have some negative regions.
The trick for dealing with a negative p(x) is to split it once again, into sgn(p) abs(p): sample abs(p), and move sgn(p) over with g(x). The integral now looks like
[int abs(p) sgn(p) g(x) dx / int abs(p) dx] / [int abs(p) sgn(p) dx / int abs(p) dx]
Here's where I get lost. One source of the problem could be the denominator - since it is the average of a signed quantity, it could be small (at least relative to the errors) and magnify errors in the numerator? But the paper says that both the numerator and denominator have errors that grow exponentially in system size. So the denominator is an irritant, but not the root cause.
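A toy illustration of the denominator issue, with a completely made-up one-dimensional "p(x)" that has nothing to do with a real fermion problem: as the positive and negative regions come closer to cancelling, the average sign shrinks and the scatter of the ratio estimate grows.

```python
import numpy as np

rng = np.random.default_rng(4)

def one_estimate(c, n=20000):
    # One estimate of  int p*g dx / int p dx  on [0,1], with p(x) = sin(2*pi*x) + c
    # and g(x) = x, done by sampling abs(p) (rejection sampling) and carrying the
    # sign along as part of the estimator.
    pmax = 1.0 + abs(c)
    xs = np.empty(0)
    while xs.size < n:
        x = rng.random(n)
        keep = rng.random(n) < np.abs(np.sin(2.0*np.pi*x) + c) / pmax
        xs = np.concatenate((xs, x[keep]))
    x = xs[:n]
    sgn = np.sign(np.sin(2.0*np.pi*x) + c)
    return np.mean(sgn*x) / np.mean(sgn), np.mean(sgn)

# Shrinking c makes the positive and negative regions of p nearly cancel.
for c in (1.5, 0.3, 0.1):
    trials = [one_estimate(c) for _ in range(25)]
    ratio = np.array([t[0] for t in trials])
    avg_sign = np.mean([t[1] for t in trials])
    print(f"c = {c:3.1f}   <sign> = {avg_sign:5.2f}   "
          f"ratio estimate = {ratio.mean():7.3f} +/- {ratio.std():6.3f}")
```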
The error of the average sign (again from the paper) is related to the exponential of the free energy difference between the fermionic problem (with p(x)) and the bosonic problem (with abs(p(x))). The cancellation induced by the minus signs necessarily makes the fermionic partition function smaller than the bosonic one, and hence the fermionic free energy larger. But I still don't see how this makes the problem hard.
Why not add a sufficiently large constant to p to make it positive? Now there are no minus signs anywhere. But the essential features of the integrand haven't changed - so it can't be a "sign" problem. I guess I view the complexity of integration as coming from the unpredictable fluctuations of the integrand - wildly fluctuating integrands are "harder" to do than smooth integrands. Adding a constant doesn't change this.
Perhaps the antisymmetry requirements produce a function that has nearly equal positive and negative parts. Each one separately has a small relative error, but the nearly exact cancellation causes great magnification of the error (a well known numerical analysis bugaboo). Thus, the experiment of adding a constant doesn't change anything - it's added equally to both parts, and cancels exactly.
Lennard-Jones Server, Part II
Q. Why remote computation - why not just download a program to the user's computer?
A. Have you ever downloaded software and tried to obtain all the dependencies? Hopefully, the client program would be much simpler. Another advantage is that bug fixes and enhancements would take place on the server and automatically be picked up by the users (although this would require versioning to meet reproducibility requirements for researchers). The general idea of web services is somewhat of a holy grail for business types - getting paid every time someone uses the service, or getting paid on a subscription basis. Which is why word processing software is unlikely to take off as a web service any time soon. Getting someone to pay for a Lennard-Jones server is also unlikely.
Q. Why not just have a web page? Like Google, Amazon, etc.
A. A web service allows interfacing to these services under program control, which allows your computer to buy stuff on Amazon when it gets bored.
Q. Why a single component Lennard-Jones system? It's not exactly a hotbed of research.
A. No, but it is a testbed. New techniques are often tested on simple systems, and benchmark results would be useful. More importantly, it is a simple first test of the simulations over web services idea. I can see expanding to more complicated classical systems and QMC later.
Q. Can you explain more about accumulating a database of properties?
A. Yes. Aggregating data is a noble and useful goal, but it requires careful vetting of the input data. The idea behind a Lennard-Jones server is that users would only input physical parameters, not raw data. The data aggregation could then be automated, since the data (ie, the output of the program) should be of good quality (or at least of known quality).
The particular example I have in mind is computing free energies and the phase diagram. Computing a grid of pressures or energies at a number of state points is relatively easy, but computing free energies and such is harder - it can require integrating over a number of state points (if you use thermodynamic integration). Hopefully, as more state points get put into the system, the free energies get more accurate. Calculations done during the idle time could be planned specifically to lower the errors at the edges of the phases.
Now such a database will likely be inferior to targeted research trying to answer a particular question (ie, accurate freezing points, etc), but it seems like a good source of background information or a starting point. Also, the background or baseline information would be in electronic form, and as such be more useful than in a paper. (Not that papers aren't important - the raw data needs to be interpreted and understood.)
Q. Who would be likely users for such a service?
A. On the computation side, companies and researchers that use the results of simulations, but don't care about the details of the simulations. For researchers in the particular field the web service covers, it might not be so useful (the whole "not outsourcing your core competency" thing), although it would be useful for benchmarking and comparison of results.
On the data side, having a good collection of properties for the desired system would be useful for everyone.
Q. What would you do if it were the late 90's?
A. I would run out and register eSimulations.net.com and/or eWavefunctionsRUs.com, write a buzzword-compliant business plan about the wonderful opportunities in the B2C, B2B, R2B and R2R markets (the last two, made up on the spot, are Researcher to Business and Researcher to Researcher), and how actual experiments in the real world are passe, and the new e-world of the internet will make the real world obsolete. And try to get lots of VC money and go public with an overhyped IPO. Ah, for the good old days.
Sunday, October 24, 2004
Comments enabled
And I've started adding titles to the posts. Enjoy
Saturday, October 23, 2004
Lennard-Jones Server
If I could write blog entries as funny as Rory Blyth, I'd be famous. Well, maybe not. I'd have to draw cartoons, too. But to really be famous, I'd have to gratuitously link to Robert Scoble. (And get him to link back to this blog, the really hard part.) (For those who have non-computer-geek lives, Scoble wrote an article on how to get your blog noticed.)
The vaguely relevant part of that last paragraph is Rory's comments about XML Dev Con 2004. Some of the talks related to Web Services.
Web Services - what are they? Zef Hemel has a good description of Service Oriented Architecture, which basically means using web services (ie, Remote Procedure Calls over port 80) to construct an application.
That leads me to the Grid, of which web services seem to be a part. (But don't get me started on the popular analogy between the power grid and the computational grid. It's seductive and appealing, and perhaps even useful at a very abstract level, but the details break down badly if you push it very far - see the Scientific American article, among others.)
The grid concept seems to encompass four different things (as far as I can tell):
- Connecting geographically (and/or institutionally) separated supercomputers to work on a single calculation.
- Harvesting idle cycles on desktop PC's
- Running specific computations on remote machines (this is web services)
- Running arbitrary computations on remote machines (like the recent Sun announcement, but other companies have offered similar services previously)
Concerning items 3 and 4, I suspect web services will be more popular than running arbitrary computations remotely. If you have enough computational or data requirements to entertain thoughts of rent-a-FLOP (or rent-a-MIP), you will care about the environment and how well the code is optimized for the target architecture (or at least how predictable the run time is). But caring about such things makes buying "generic" computing power harder, or at least more complex.
But web services are (?) about hiding those nasty details. The provider can optimize the code for the target architecture or other details, and the user need not worry. It may be useful to divide the services into two kinds: one that requires lots of data, and the other that requires lots of computation. For data-heavy services, the provider has lots of specialized data that is updated frequently (think Google, Amazon, Ebay, or stock market info), and delivering the data + program to the user would be impractical. (Versus something like TurboTax, which contains lots of specialized knowledge that changes yearly, but is practical to deliver as data + program to the user's computer - although it can be, and is, offered over the web as well.) For computation-heavy services, optimization for particular hardware, etc. could be considered specialized knowledge as well (making the divisions less distinct), but I still think the division may be useful.
As a way of understanding web services and grid concepts, I was thinking of a Lennard-Jones server. A user could request properties of a single-component system interacting via a Lennard-Jones potential for a given temperature/density/system size, the server would run the simulation and return the energy, pressure, g(r), and whatever else. (It would require limits on the system size to keep from completely overloading the server.)
The server could have other features, such as keeping a database of run results and returning the simulation results nearest the requested parameters. Although I'm not sure this would work so well if one extended the idea to more complicated systems with larger parameter spaces.
In some ways, it's like a database of material properties, except that the properties are all generated by computation. So the database could be improved by further computation - if the server is idle, it could work on reducing errors in various quantities. Although this might wreak havoc on reproducibility for users - good versioning would be needed.
Saturday, October 16, 2004
3D Function Plotter
In the continuing quest to visualize wavefunctions, I've put up the first version of a 3D function plotter.
Wednesday, September 29, 2004
How to Integrate a Derivative
In the previous post, I had tried to use derivatives to improve the estimate for a single sample, but that turned out not to work very well. That was an approximation; to use the derivative exactly, start from the definition f(x) - f(0) = int_0^x f'(y) dy.
Does this have a practical application? After all, if we do this on the example in the previous post, we go from a single integral to a double integral - it looks like we're making the problem harder! Ah, but this is Monte Carlo. The convergence rate is independent of dimensionality. So adding more integrals doesn't necessarily make the problem harder.
The final error does, however, depend on the intrinsic variance of the integral. In this example, the variance gets much worse, so making this transformation is not a good idea. Since derivatives will usually vary more than the original function, this technique is not likely to be useful in general - unless the transformed problem has some other desirable features, like better importance sampling or something.
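A quick numerical check of that claim on about the simplest example possible, f(x) = x^2 on [0,1]: both estimators converge to 1/3, but the rewritten (two-dimensional) form has the larger variance.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200000

# f(x) = x^2 on [0,1], so f(0) = 0 and f'(y) = 2y; the exact integral is 1/3.

# Direct 1D Monte Carlo:  I = < f(x) >  with x uniform on [0,1].
x = rng.random(n)
direct = x**2

# Using f(x) = f(0) + int_0^x f'(y) dy turns this into an integral over the unit
# square:  I = < 2*y * (y < x) >  with (x, y) uniform on [0,1]^2.
x2, y2 = rng.random(n), rng.random(n)
via_derivative = 2.0*y2*(y2 < x2)

for name, s in (("direct (1D)", direct), ("via the derivative (2D)", via_derivative)):
    print(f"{name:24s} mean = {s.mean():.4f}   variance = {s.var():.3f}")
# Both means converge to 1/3, but the variance grows from 4/45 to 2/9, so the
# rewritten form needs more samples for the same statistical error.
```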
Finally, I should note that something similar to this is used in thermodynamic integration to get free energies.
Monday, September 27, 2004
Sometimes it doesn't work
The other day I had an idea about trying to improve MC samples. The Monte Carlo method for evaluating an integral is the average of a set of samples. Each sample can be considered a very crude approximation to the integral - each sample approximates the integrand as a constant. What if we could make each sample a better approximation, say by using the derivative information at the sample point?
Another consideration is that the intrinsic variance of the integral (int [f(x)-ave(f)]^2) controls the accuracy of the MC approximation. If we could smooth the integrand, and lower the variance, the MC integral would be better. Perhaps expanding the integral in a series around the sample point and integrating would help smooth it?
Then reality struck. I tried it on v(x)*exp(-v(x)/T), with v(x) = 1/x^12. I made the expansion and integrated over the interval [x-h, x+h], and graphed the result - it was clearly *not* smoother than the original function. It was also clear that the expansion was an approximation, and only correct in the limit of small h.
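Something roughly along these lines (not the exact script I used) shows both points. It compares the window average predicted by the second-order Taylor expansion, f(x) + f''(x) h^2/6, against the exact window average computed by brute-force quadrature, near the repulsive wall:

```python
import numpy as np

T = 1.0

def f(x):
    v = x**-12
    return v*np.exp(-v/T)

d = 1e-4
for h in (0.05, 0.3):
    print(f"--- h = {h} ---")
    for x0 in (0.85, 0.9, 1.0, 1.2):
        # f'' by central differences, then the average of the second-order Taylor
        # expansion over [x-h, x+h], which works out to f(x) + f''(x)*h^2/6.
        fpp = (f(x0 + d) - 2.0*f(x0) + f(x0 - d)) / d**2
        taylor_avg = f(x0) + fpp*h*h/6.0
        # exact window average by brute-force quadrature, for comparison
        ys = np.linspace(x0 - h, x0 + h, 4001)
        exact_avg = np.mean(f(ys))
        print(f"x = {x0:4.2f}   f = {f(x0):8.4f}   Taylor avg = {taylor_avg:8.4f}"
              f"   exact avg = {exact_avg:8.4f}")
```

For small h the expansion tracks the true window average reasonably well; for the larger h it overshoots badly (even going negative) exactly where f changes fastest.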
Upon further reflection, I realized a few things. One is that each MC sample may be a crude approximation to the integral, but it has the important property of being unbiased (ie, all the errors will average to zero). Any improvement in the sample value needs to retain this property.
Secondly, smoothing would be most valuable when the integrand is rapidly changing. And this is just the condition where the Taylor expansion is likely to be the least accurate. I suppose that any local smoothing of the integrand will suffer from this sort of problem.
Sometimes, it doesn't work. Ah, but that's what research is all about.
Saturday, September 04, 2004
Visualizing Wavefunctions
I've started learning VTK, Python, and Tk in order to look into visualizing wavefunctions. So far, I've made an application that plots a function of x and y. The user can change the plotted function in an edit box. Get the 2D plotter here.
I've also got a prototype application that plots a function of 3 variables, and lets the user choose a slice (a plane) through the displayed volume, and the application plots the 2D slice in a view that looks very much like the 2D plotter. But it's not quite ready for public consumption yet.
Thursday, July 29, 2004
Welcome to the Metropolis Disco - How do you move your particles?
Out on the dance floor tonight we have the spotlight on two papers on improving the efficiency of Monte Carlo moves. But first, a brief summary of the results of Peskun's theorem (Both papers mention it.). The Metropolis algorithm takes a trial move and accepts or rejects the move with some probability. If the move is rejected, that location is used in the average again. Rejected moves will increase the autocorrelation, and so increase the variance of the resulting average. Peskun's theorem validates this intuition, under certain conditions. So the result is that, just like dancing, moving is better than standing still.
The first paper is Delayed Rejection Variational Monte Carlo by Bressanini, Morosi, and Tarasco. The problem is that for best efficiency, moves near an atomic nucleus should have a small step size, and moves far away from a nucleus should have a large step size. If a move is rejected, it is likely that the step size was too large. One could imagine a dynamic or adaptive step size method that lowers the step size when there are lots of rejections and increases it when there are lots of acceptances (*). This paper does something slightly different - when a move is rejected, it is retried with a smaller step, but the original rejected move is not made part of the average. The acceptance probability is adjusted to take this into account. So there is an increase in the amount of time it takes for a move, but that is made up for by a larger decrease in the autocorrelation time.
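For my own reference, here is a sketch of the delayed-rejection step itself, written with the generic Tierney-Mira second-stage acceptance rule and a hydrogen-like psi = exp(-r) as the sampled density. The specific proposal choices in the paper may well differ - this is just the mechanics of "retry with a smaller step and fix up the acceptance probability so detailed balance still holds".

```python
import numpy as np

rng = np.random.default_rng(6)

def psi(R):
    # hydrogen-like 1s trial function; the sampled density is psi^2
    return np.exp(-np.linalg.norm(R))

def gauss(d, s):
    # isotropic gaussian density for a displacement d with step size s
    return np.exp(-np.dot(d, d) / (2.0*s*s)) / (2.0*np.pi*s*s)**1.5

def dr_step(R, s1=1.5, s2=0.3):
    # One delayed-rejection Metropolis step for pi = psi^2: stage 1 tries a large
    # move; if it is rejected, stage 2 tries a smaller move from the original
    # point, with a modified acceptance probability.
    p0 = psi(R)**2
    R1 = R + s1*rng.standard_normal(3)
    p1 = psi(R1)**2
    a1 = min(1.0, p1/p0)
    if rng.random() < a1:
        return R1
    R2 = R + s2*rng.standard_normal(3)
    p2 = psi(R2)**2
    a1_rev = min(1.0, p1/p2)    # alpha_1 for the reverse first-stage move R2 -> R1
    num = p2 * gauss(R1 - R2, s1) * (1.0 - a1_rev)
    den = p0 * gauss(R1 - R,  s1) * (1.0 - a1)
    a2 = min(1.0, num/den)
    return R2 if rng.random() < a2 else R

R = np.array([1.0, 0.0, 0.0])
rs = []
for step in range(100000):
    R = dr_step(R)
    rs.append(np.linalg.norm(R))
print("<r> over the run:", round(np.mean(rs[10000:]), 3),
      " (exact value 1.5 for psi = exp(-r))")
```

The second-stage proposal here is centered on the original point and is symmetric, which is why its density cancels out of the acceptance ratio; only the first-stage proposal densities survive.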
The second paper is Optimal Monte Carlo Updating by Pollet, Rombouts, Van Houcke, and Heyde. They work with discrete systems, and introduce a procedure called "Metropolizing" - taking a move rule (eg, heat bath) and transforming it into a new, more efficient transition matrix. This process can be applied repeatedly until there is only one non-zero diagonal element, resulting in a very efficient transition matrix (for single-site updating rules).
(*) I'm not entirely sure a simple dynamic step size method would satisfy detailed balance. I think it would, but that should be checked.
Wednesday, July 21, 2004
More Graphic Card Computations
There's a site, GPGPU (General Purpose computation on Graphics Processing Units), dedicated to interesting things one can do with a graphics card. (Well, there's always graphics, but using things for their intended purpose can get boring :)
Last fall, I wondered about the possibility of doing Monte Carlo on the graphics card.
On the GPGPU site there's a link to a paper about doing Ising model and percolation simulations on the graphics card (Benchmarking and Implementation of Probability-Based Simulations on Programmable Graphics Cards). The work is primarily for proof of concept and benchmarking, with an eye towards lattice QCD simulations eventually. Hmm. Play a killer game of Quake *and* uncover fundamental secrets of the universe - what a combination!
I can imagine students requesting the latest "Ising co-processors" so they can "make progress on their thesis" when Doom 3 comes out. (Err, maybe they need to leave off that last part when making the request).
Wednesday, May 12, 2004
PIMC live
Check out the Path Integral simulation applet on John Shumway's page. You can change the temperature and see the paths localize at high temperature and spread out at low temperature.
Friday, April 16, 2004
Pseudorandom number generators and the stock market
As correlations are discovered in pseudorandom number generators used in simulations, the correlations are removed by creating better generators. So as time marches on, the state-of-the-art PRNG's should be getting more and more random.
As correlations are discovered in the stock market, they are removed as the reservoir of correlation is drained for its profit. So the stock market should be getting more and more random as time goes on (especially as faster computers and more sophisticated statistical techniques are used).
Thursday, March 11, 2004
Experimentally switching between liquid and solid phases?
Looking at Grochola's free energy path, I thought it was a nice computational technique.
The March 2004 issue of Physics Today has an article on controlling atoms and ions in optical traps (it's actually about using these for quantum information processing). On page 41, it describes putting the atoms in an optical lattice. The depth of the wells and the hopping matrix elements can be controlled by varying the intensity of the lasers creating the lattice.
It seems like this would let one experimentally switch between solid and liquid states without going through a phase transition.
There are some differences - I don't know if they are important or not. First, the experimental setup is a quantum system, while Grochola's paper describes a classical system. Second, the confining potential is different: in the experimental setup, atoms are confined to lattice sites but are not tethered to any particular lattice site.
The Physics Today article mentions a Mott insulator transition when the confining potential becomes strong enough. This transition seems to be due to the superfluid (quantum) nature of the liquid state. If the experiment were performed on a non-superfluid system, would this transition simply not happen?
Update (3/16/2004): I forgot the final stage in the path - turning off the confining potential to get a stable solid state. I'm guessing the lattice spacing of the optical lattice is at least the laser wavelength, so it's much larger than the lattice spacing of a normal solid. But some atoms can have their scattering length tuned - could it be made large enough to be comparable to the optical lattice spacing? In that case the atoms would behave as hard spheres (most importantly, there would likely not be enough attraction to keep the solid together), so an external potential would be required. Most of these setups seem to generate a harmonic confining potential (a box-like potential seems much harder to generate?), so the question is: what is the behavior of hard spheres in a harmonic potential?
For this system, there are two parameters - the number of particles and the strength of the external potential. I suspect there would be three regimes - one all liquid, one with a solid center surrounded by a liquid exterior, and one all solid. To take the thermodynamic (large N) limit, it seems that one would need to increase the particle number and decrease the external potential strength simultaneously, to keep the central density or pressure constant (otherwise the pressure/density at the center would increase with added particles). Or maybe such a limit doesn't exist?
Wednesday, February 11, 2004
The Way of Antisymmetry
The most general form is obtained by explicitly antisymmetrizing: summing the values of a function with all possible permutations of the electron coordinates (and multiplying by the sign appropriate for the permutation). For example, with 2 particles and a function f(r1,r2), the antisymmetrized form is f(r1,r2) - f(r2,r1).
Unfortunately, the number of terms is N!. (The determinant form is obtained by applying this procedure when the function has the special form of the product of functions that each depend only on a single electron coordinate - these functions are the single particle orbitals. In the two particle example, f = f1(r1)f2(r2). The antisymmetrized form is f1(r1)f2(r2) - f1(r2)f2(r1), which is the determinant of a matrix.)
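A direct transcription of that procedure, with the sign of each permutation computed by counting inversions. For a product of single-particle orbitals it reproduces the determinant, as it should; the three "orbitals" below are arbitrary toy functions of a single coordinate, not anything physical.

```python
import numpy as np
from itertools import permutations

def perm_sign(p):
    # sign of a permutation, from the number of inversions
    inversions = sum(1 for i in range(len(p)) for j in range(i+1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def antisymmetrize(f, coords):
    # Explicit antisymmetrization: sum f over all N! permutations of the particle
    # coordinates, each term weighted by the sign of the permutation.
    n = len(coords)
    return sum(perm_sign(p) * f(*[coords[i] for i in p]) for p in permutations(range(n)))

# Toy single-particle "orbitals" (arbitrary functions of one coordinate).
orbitals = [lambda r: np.exp(-r),
            lambda r: r*np.exp(-r/2.0),
            lambda r: r*r*np.exp(-r/3.0)]

def product_form(r1, r2, r3):
    return orbitals[0](r1) * orbitals[1](r2) * orbitals[2](r3)

r = [0.5, 1.1, 2.0]
explicit = antisymmetrize(product_form, r)
slater = np.linalg.det([[orb(ri) for orb in orbitals] for ri in r])
print(explicit, slater)    # the two agree: the determinant is the N!-term sum
```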
Dario Bressanini, Massimo Mella, Gabriele Morosi, and Luca Bertini have several papers using this form of the wavefunction for small molecules, and they get very good results. The papers can be found on Massimo Mella's publication list (edit 12/3/2005 - removed nonworking link). They are (this may not be a complete list):
- Many-electron correlated exponential wavefunctions. A Quantum Monte Carlo application to H2 and He2+
Chemical Physics Letters 240, 566 (1995) (paper available on CiteSeer)
- Nonadiabatic wavefunctions as linear expansions of correlated exponentials. A Quantum Monte Carlo application to H2+ and Ps2
Chem. Phys. Lett. 272, 370 (1997) (paper available on CiteSeer)
- Linear expansions of correlated functions: a variational Monte Carlo case study
Int. J. Quantum Chem. 74, 23 (1999) (paper available on CiteSeer)
- Explicitly correlated trial wave functions in quantum Monte Carlo calculations of excited state of Be and Be-
J. Phys. B (2001)
Not all N! terms will give a significant contribution to the result. Is it possible to use techniques similar to O(N) approaches to reduce the number of terms? Is it possible to sample terms from the sum? These might have trouble near the nodes, since the wavefunction vanishes due to the cancellation of all the terms.
DMC of porphyrin
An all electron calculation of a large molecule (large by QMC standards, anyway):
Quantum Monte Carlo for electronic excitations of free-base porphyrin by Alan Aspuru-Guzik, Ouafae El Akramine, Jeffrey Grossman, and William Lester. The porphyrin they used has the formula C20 N4 H14, so 162 electrons.
Wednesday, February 04, 2004
Solid vs. Liquid, The Free Energy Battle
Determining the free energy difference between the solid and liquid states is essential for accurately mapping the phase diagram of a substance. Typically, one determines the free energies of the solid and liquid states through separate thermodynamic integrations, where each integration has one end point connected to a state with an analytically known free energy. In the solid state, the Einstein crystal (springs attaching particles to lattice sites) is often used as a starting point. In the liquid state, the low density gas can be used as an end point. Other more complicated scalings of the potential are also possible for the integration path.
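As a reminder of what the thermodynamic integration bookkeeping looks like, here is a sketch on a system where the answer is known in closed form - a single classical particle whose spring constant is switched from k0 to k1. This has nothing to do with the solid-liquid path discussed next; it is just the generic Delta F = int <dU/dlambda> dlambda machinery.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 1.0                      # temperature, with k_B = 1
k0, k1 = 1.0, 4.0            # spring constants of the reference and target systems

def dU_dlam(lam, nsamples=200000):
    # <dU/dlambda> for U_lambda(x) = 0.5*(k0 + lam*(k1 - k0))*x^2, sampling x from
    # the exact Boltzmann distribution at this lambda (a gaussian of variance T/k).
    k = k0 + lam*(k1 - k0)
    x = rng.normal(0.0, np.sqrt(T/k), nsamples)
    return np.mean(0.5*(k1 - k0)*x*x)

# Thermodynamic integration:  Delta F = int_0^1 <dU/dlambda> dlambda
lams = np.linspace(0.0, 1.0, 21)
integrand = np.array([dU_dlam(l) for l in lams])
dF_ti = np.sum(0.5*(integrand[1:] + integrand[:-1])*np.diff(lams))   # trapezoid rule
dF_exact = 0.5*T*np.log(k1/k0)        # known in closed form for the harmonic system
print(f"TI estimate: {dF_ti:.4f}    exact: {dF_exact:.4f}")
```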
A direct integration path between the solid and liquid states would be useful.
Gregory Grochola has published one such path in "Constrained fluid lambda-integration: Constructing a reversible thermodynamic path between the solid and liquid state"(JCP 120, 2122).
The basic path is to start with the liquid state and turn off the intermolecular potential. Then turn on a potential constraining particles to lattice sites (He used a Gaussian potential). The last step is to turn off the lattice potential while turning the intermolecular potential back on.
He mentions at the end that this method is not very computationally efficient. I would expect using Bennett's method for free energy differences would cut down on the number of intermediate states needed.
Questions
- Compare and contrast this path with the phase switching Monte Carlo technique (there's a reference in the article)
- Thermodynamic integration requires there be no phase transitions on the path. Is it immediately obvious that this path has no transitions? Under what conditions, if any, would a phase transition occur on the path?
Saturday, January 24, 2004
Hard Spheres on the Move
Cluster algorithms increase efficiency for systems on a lattice by flipping large numbers of connected spins at once. In "Cluster Monte Carlo algorithms", Werner Krauth summarizes work on how such algorithms can be used in a continuum hard sphere simulation.
The basic idea is to rotate particles by pi around a randomly chosen axis. (The key observation is that this transformation is its own inverse.) A practical algorithm (the pocket algorithm) chooses a collection of particles that can be rotated without any hard spheres overlapping. It starts with a single particle and rotates it. If it overlaps with any other particles, they are added to the collection. Then the new particles are tested in the same way, and the collection is built up of all the particles that need to be rotated to prevent any overlaps.
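A bare-bones version of the pocket algorithm for hard disks in a periodic box - my own reconstruction from the description above, so the details (uniform random pivot, point reflection, minimum-image overlap test) are assumptions rather than a transcription of Krauth's algorithm.

```python
import numpy as np

rng = np.random.default_rng(8)
L = 10.0          # side of the periodic box
sigma = 1.0       # hard-disk diameter
N = 40

def min_image(d):
    return d - L*np.round(d/L)

def overlap(pos, i, j):
    d = min_image(pos[i] - pos[j])
    return np.dot(d, d) < sigma*sigma

def pocket_move(pos):
    # One pocket-algorithm cluster move.  The map is a point reflection through a
    # random pivot (in 2D this is the same as a rotation by pi), which is its own
    # inverse.  The pocket holds particles that still have to be moved.
    pivot = L*rng.random(2)
    pocket = [rng.integers(N)]
    moved = set()
    while pocket:
        i = pocket.pop()
        pos[i] = (2.0*pivot - pos[i]) % L          # reflect particle i through the pivot
        moved.add(i)
        for j in range(N):
            if j not in moved and j not in pocket and overlap(pos, i, j):
                pocket.append(j)                   # j now has to be moved as well
    return pos

# Start from a non-overlapping square lattice and apply some cluster moves.
side = int(np.ceil(np.sqrt(N)))
pos = np.array([[(i % side) + 0.5, (i // side) + 0.5] for i in range(N)]) * (L/side)
for _ in range(100):
    pos = pocket_move(pos)

# Sanity check: the moves should never create an overlap.
bad = sum(overlap(pos, i, j) for i in range(N) for j in range(i + 1, N))
print("overlapping pairs after the moves:", bad)
```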
At high densities, this algorithm will fail because the pocket cluster will contain all (or most) of the particles in the system, and simply rotate the entire configuration. Despite this limitation, there are systems where these moves are useful (see the paper for some examples).
On another subject, suppose you wanted to keep the trial box size fixed and see how the dynamics varied with density (which would be useful if one were studying the actual MC dynamics). At higher densities, the efficiency would get quite bad. Hirosi Watanabe et al. tackle this problem in A rejection-free Monte Carlo method for the hard-disk system. They use geometry rather than random sampling to determine the allowable move area (and acceptance probability). To actually choose a point, they use a rejection method (it's not clear why the method is called "rejection-free", then - maybe because the rejections are not essential for maintaining detailed balance, and rejected moves don't need to be included in the average). At the step of choosing a new point, the area they choose from is greatly reduced from the original trial area.